====== statistic() ======

''mixed **statistic**(string //statistic//, array|string //variables//, mixed //option//, [boolean //alldata//])''

The function ''statistic()'' calculates univariate statistics from the data set collected so far (across all previous questionnaires).


  * //statistic//\\ Which statistic should be calculated?
    * '''count''' -- counts the frequency of the value specified as ''//option//''.
    * '''percent''' -- percentage of the value specified as ''//option//''.
    * '''crosscount''' -- counts how often two values occur together in two variables. The two variables are specified as an array (or as a comma-separated string), and the values to match are passed in the same form as ''//option//''.
    * '''mode''' -- most commonly occurring value. 
    * '''min''' -- lowest value.
    * '''max''' -- highest value.
    * '''mean''' -- arithmetic mean of the values.
  * //variables//\\ Determines which variable(s) the statistic should be calculated for. The IDs of the individual variables can be found in the **Variables Overview**. If the statistic requires multiple variables, these can be given as a comma-separated string or as an array.
  * //option//\\ Some statistics call for or allow a third entry which is set with this parameter (see below).
  * //alldata//\\ Optional. If set to ''true'', all questionnaires are included in the statistic, not just those that have been completed.

**Note:** If ''true'' is not explicitly specified for the parameter //alldata//, only completed questionnaires are included when calculating the statistical values.

**Note:** Test data collected while developing and pretesting the questionnaire is only included if the current questionnaire is itself part of the test. If the questionnaire is being carried out as part of the regular data collection, ''statistic()'' only counts data from the regular data collection.


===== Frequency Count =====

When counting the frequency (''count''), a third argument can be specified: which value the frequency should be determined for. If a third value is not given, the number of valid responses is output. Missing data is not counted. 

For example, the questionnaire contains a question where respondents select their gender (1=female, 2=male, -9=no input). The number of women can be determined by passing the value ''1'' as the third argument:

<code php>
$numberwomen = statistic('count', 'SD01', 1);  // frequency of women (1)
$numbermen = statistic('count', 'SD01', 2); // frequency of men (2)
$numbercompleted = statistic('count', 'SD01');    // number of valid responses
$numberall = statistic('count', 'SD01', false, true); // all data records
html('
  <p>So far, '.$numberall.' people
  specified their gender in this survey, but the questionnaire was
  only completed in '.$numbercompleted.' cases.</p>
  <p>The completed questionnaires are made up of '.
  $numberwomen.' women and '.
  $numbermen.' men.</p>
');
question('SD01');  // question about the respondent's gender
</code>


===== Multivariate Frequency =====

The '''crosscount''' statistic counts the cases in which a given combination of values occurs across multiple variables (as in a cross-tabulation).

Instead of a single variable, two or more variables are specified as an array or separated with a comma ('',''). The values to be counted for each variable are specified as the third parameter //option//. A case is only counted if it has the first value for the first variable, the second value for the second variable, and so on.

<code php>
$nYoungFemale = statistic('crosscount', 'SD01,SD02', '1,1');  // variables and values in a list with commas ...
$nGrownFemale = statistic('crosscount', array('SD01','SD02'), array(1,2));  // ... or in arrays (1 = female, as coded above)
html('
  <p>So far, '.$nYoungFemale.' people have stated in this survey 
  that they are female and in age group 1 (up to 18 years old).
  '.$nGrownFemale.' women stated they were older than 19 years old.</p>
');
question('SD01');  // question about the respondent's gender
question('SD02');  // question about the respondent's age
</code>
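
The counting rule can also be illustrated in plain PHP, outside the questionnaire. The following is a minimal sketch and not the way ''statistic()'' is implemented: the helper ''crossCountSketch()'' and the sample records are hypothetical. They only show that a case is counted when every listed variable holds its paired value.

<code php>
// Hypothetical helper: counts records where each variable equals its paired value.
function crossCountSketch(array $records, array $variables, array $values) {
  $n = 0;
  foreach ($records as $record) {
    $match = true;
    foreach ($variables as $i => $variable) {
      // A record without the variable, or with a different value, is not counted
      if (!isset($record[$variable]) || $record[$variable] != $values[$i]) {
        $match = false;
        break;
      }
    }
    if ($match) {
      $n++;
    }
  }
  return $n;
}

// Invented sample data: SD01 = gender (1 = female, 2 = male), SD02 = age group
$records = array(
  array('SD01' => 1, 'SD02' => 1),
  array('SD01' => 1, 'SD02' => 2),
  array('SD01' => 2, 'SD02' => 1),
  array('SD01' => 1, 'SD02' => 1),
);
$nYoungFemale = crossCountSketch($records, array('SD01', 'SD02'), array(1, 1));  // 2
</code>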


===== Valid Percent =====

The output is the percentage of a value within all valid data. The value to be counted must be given as the third argument. 

<code php>
$percentwomen = statistic('percent', 'SD01', 1); // percentage of women
html('
  <p>So far, the percentage of women among the participants
  in this survey is '.$percentwomen.'.</p>
');
question('SD01');  // question about the respondent's gender
</code>
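
To make the term "valid percent" concrete, here is a rough sketch in plain PHP. It is not the implementation of ''statistic()'': the helper ''validPercentSketch()'' and the sample codes are hypothetical. The sketch also assumes that -9 is the only missing code, that the result is scaled from 0 to 100, and that 0 is returned when there is no valid data. None of these details are confirmed on this page.

<code php>
// Hypothetical helper: share of $value among all non-missing codes.
function validPercentSketch(array $codes, $value, array $missingCodes = array(-9)) {
  $valid = 0;
  $hits = 0;
  foreach ($codes as $code) {
    if (in_array($code, $missingCodes)) {
      continue;  // missing data is left out of the base
    }
    $valid++;
    if ($code == $value) {
      $hits++;
    }
  }
  if ($valid == 0) {
    return 0;  // assumption: no valid data yields 0 (avoids division by zero)
  }
  return 100 * $hits / $valid;  // assumption: scaled from 0 to 100
}

// Invented sample data: 1 = female, 2 = male, -9 = no input
$codes = array(1, 2, 1, -9, 1);
$percentWomen = validPercentSketch($codes, 1);  // 3 of the 4 valid answers
</code>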


===== Mode: Value that Occurs Most Frequently =====

This returns the value that has been selected most frequently so far. If multiple values have been selected equally often then these are returned separated by a comma. 

As a third argument (in this instance a Boolean), it is possible to specify if invalid values (no answer etc.) should also be counted.

<code php>
$mode = statistic('mode', 'AB01_02', true);
$modes = explode(',', $mode);  // separate multiple values
if (count($modes) > 1) {
  // multiple values stated most frequently
  html('
    <p>Multiple answers were selected equally often.</p>
 ');
} else {
  // answer options text (statistic() only provides the numeric code)
  $text = getValueText('AB01_02', $mode);
  html('
    <p>The most common answer for this question was: '.$text.'.</p>
  ');
}
</code>
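
If tied values should be listed rather than only detected, the comma-separated return value can first be split into individual codes. The following plain-PHP sketch uses a hypothetical helper, ''modeCodesSketch()''. It assumes that ties arrive as a string such as ''2,4'' and that a single mode may arrive either as a number or as a string.

<code php>
// Hypothetical helper: turns the return value of the 'mode' statistic into a list of codes.
function modeCodesSketch($mode) {
  $codes = array();
  foreach (explode(',', (string)$mode) as $part) {
    $part = trim($part);
    if ($part !== '') {
      $codes[] = (int)$part;  // answer codes are treated as integers
    }
  }
  return $codes;
}

$codes = modeCodesSketch('2,4');  // as if two values were tied
$list = implode(' and ', $codes);
</code>

Within the questionnaire, each code in the resulting array could then be passed to ''getValueText()'', as in the example above.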


===== Min, Max and Mean of the Valid Data =====

The statistics '''min''', '''mean''' and '''max''' only return a meaningful value if the question holds numerical values. Text input that is not a number is ignored, unless ''true'' is passed as the third parameter to include invalid values in the statistic.

If no valid values are available, 0 is returned as the '''mean''', and ''false'' is returned for '''min''' and '''max'''.

<code php>
$min = statistic('min', 'BB01_03');
$max = statistic('max', 'BB01_03');
$mean = statistic('mean', 'BB01_03');
html('
  <p>Participants have given the programme
  an average rating of '.$mean.' so far.</p>
  <p>The ratings lie between '.$min.' and '.$max.'.</p>
');
</code>
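
Because ''min'' and ''max'' return ''false'' when there is no valid data, it can be useful to check for that case before building the output text. The following is a minimal plain-PHP sketch. The helper ''describeRangeSketch()'' and the fallback wording are hypothetical.

<code php>
// Hypothetical helper: builds the range sentence, or a fallback if there is no valid data.
function describeRangeSketch($min, $max) {
  // Strict comparison: a legitimate minimum of 0 must not be mistaken for false
  if ($min === false || $max === false) {
    return 'No ratings are available yet.';
  }
  return 'The ratings lie between '.$min.' and '.$max.'.';
}

$withData = describeRangeSketch(0, 5);
$noData = describeRangeSketch(false, false);
</code>

In the questionnaire, the two arguments would be the results of ''statistic('min', 'BB01_03')'' and ''statistic('max', 'BB01_03')''.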